    Attention Mechanisms for Object Recognition with Event-Based Cameras

    Event-based cameras are neuromorphic sensors capable of efficiently encoding visual information in the form of sparse sequences of events. Being biologically inspired, they are commonly used to exploit some of the computational and power consumption benefits of biological vision. In this paper we focus on a specific feature of vision: visual attention. We propose two attentive models for event-based vision: an algorithm that tracks event activity within the field of view to locate regions of interest, and a fully differentiable attention procedure based on the DRAW neural model. We highlight the strengths and weaknesses of the proposed methods on four datasets, the Shifted N-MNIST, Shifted MNIST-DVS, CIFAR10-DVS and N-Caltech101 collections, using the Phased LSTM recognition network as a baseline reference model and obtaining improvements in terms of both translation and scale invariance. Comment: WACV2019 camera-ready submission

    Analysis of patients with chronic cerebro-spinal venous insufficiency and multiple sclerosis: identification of parameters of clinical severity

    The aims of this study were: i) analysis of the evolution of clinical severity in multiple sclerosis patients; ii) identification of temporal indicators of clinical worsening. We investigated by echo-color-Doppler (ECD) 789 patients (490 female and 299 male), aged 45.4 years, with chronic cerebro-spinal venous insufficiency (CCSVI) and multiple sclerosis (MS). All patients who tested positive for CCSVI by ECD assessment were divided into three groups: type 1 CCSVI (371), presenting an endo-vascular obstacle to venous drainage; type 2 CCSVI (40), presenting an extra-vascular obstacle to venous drainage due to external compression of the vessel; and type 3 CCSVI (315), presenting both endo-vascular and extra-vascular obstacles to venous drainage. We analyzed the morphological and hemodynamic data recorded on a computerized map (MEM-net). All data were collected in compliance with Italian privacy laws and are available on the National Epidemiological Observatory on CCSVI website (www.osservatorioccsvi.org). We focused on three main parameters in all studied patients: the expanded disability status scale (EDSS) score, illness duration, and CCSVI type. MS duration values stratified by grouped EDSS values in CCSVI-type-1 and CCSVI-type-3 patients show that the differences were statistically significant by the Kruskal-Wallis test: H=44.2829, degrees of freedom=1 for CCSVI-type-1 (P<0.001); and H=37.3036, degrees of freedom=1 for CCSVI-type-3 (P<0.001). The present study confirmed and complemented the scientific literature on the relation between CCSVI and MS. At the same time, we found a strong correlation between MS illness duration and severity of EDSS score: clinical severity worsens after 11 years of illness in MS patients with CCSVI type-1 or type-3 (P<0.001). These data may suggest an influence of chronic vascular disease on MS. Further research is needed to learn more about this new aspect of MS etiology.

    NAIS-Net: Stable Deep Networks from Non-Autonomous Differential Equations

    This paper introduces the Non-Autonomous Input-Output Stable Network (NAIS-Net), a very deep architecture where each stacked processing block is derived from a time-invariant non-autonomous dynamical system. Non-autonomy is implemented by skip connections from the block input to each of the unrolled processing stages and allows stability to be enforced so that blocks can be unrolled adaptively to a pattern-dependent processing depth. NAIS-Net induces non-trivial, Lipschitz input-output maps, even for an infinite unroll length. We prove that the network is globally asymptotically stable so that for every initial condition there is exactly one input-dependent equilibrium assuming tanh units, and multiple stable equilibria for ReL units. An efficient implementation that enforces the stability under derived conditions for both fully-connected and convolutional layers is also presented. Experimental results show how NAIS-Net exhibits stability in practice, yielding a significant reduction in generalization gap compared to ResNets. Comment: NIPS 201
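
The stability claim for tanh units can be illustrated with a toy scalar block. This is only a sketch under stated assumptions: the update rule `x + h * tanh(a * x + b * u + c)`, the scalar state, and the names `unroll_block`, `a`, `b`, `c`, `h` and `tol` are hypothetical choices for illustration, not the paper's implementation or its derived stability conditions.

```python
import math


def unroll_block(u, x0=0.0, a=-1.0, b=1.0, c=0.0, h=0.5, tol=1e-9, max_steps=1000):
    """Unroll a toy scalar block until its state stops changing.

    The input u is fed to every stage (the skip connection from the
    block input), so each update depends on u directly, not only on x0.
    Assumed toy setting: a < 0 and a small step h keep the iteration stable.
    """
    x = x0
    for step in range(1, max_steps + 1):
        x_next = x + h * math.tanh(a * x + b * u + c)
        if abs(x_next - x) < tol:
            return x_next, step
        x = x_next
    return x, max_steps


if __name__ == "__main__":
    # Different initial states, same input: the final state should agree.
    for x0 in (-5.0, 0.0, 5.0):
        state, steps = unroll_block(u=2.0, x0=x0)
        print(f"x0={x0:+.1f} -> state={state:.6f} after {steps} steps")
```

In this toy setting the equilibrium satisfies `a * x + b * u + c = 0`, so it is determined by the input `u` rather than by the initial state, and the number of unrolled steps varies with how far the starting point is from it.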

    Multi-View Stereo with Single-View Semantic Mesh Refinement

    While 3D reconstruction is a well-established and widely explored research topic, semantic 3D reconstruction has only recently witnessed an increasing share of attention from the Computer Vision community. Semantic annotations make it possible to enforce strong class-dependent priors, such as planarity for ground and walls, which can be exploited to refine the reconstruction, often resulting in non-trivial performance improvements. State-of-the-art methods propose volumetric approaches to fuse RGB image data with semantic labels; even if successful, they do not scale well and fail to output high resolution meshes. In this paper we propose a novel method to refine both the geometry and the semantic labeling of a given mesh. We refine the mesh geometry by applying a variational method that optimizes a composite energy made of a state-of-the-art pairwise photometric term and a single-view term that models the semantic consistency between the labels of the 3D mesh and those of the segmented images. We also update the semantic labeling through a novel Markov Random Field (MRF) formulation that, together with the classical data and smoothness terms, takes into account class-specific priors estimated directly from the annotated mesh. This is in contrast to state-of-the-art methods that are typically based on handcrafted or learned priors. We are the first, jointly with the very recent and seminal work of [M. Blaha et al arXiv:1706.08336, 2017], to propose the use of semantics inside a mesh refinement framework. Unlike [M. Blaha et al arXiv:1706.08336, 2017], which adopts a more classical pairwise comparison to estimate the flow of the mesh, we apply a single-view comparison between the semantically annotated image and the current 3D mesh labels; this improves the robustness in case of noisy segmentations. Comment: 3D Reconstruction Meets Semantic, ICCV workshop

    ReConvNet: Video Object Segmentation with Spatio-Temporal Features Modulation

    We introduce ReConvNet, a recurrent convolutional architecture for semi-supervised video object segmentation that is able to quickly adapt its features to focus on any specific object of interest at inference time. Generalization to new objects never observed during training is known to be a hard task for supervised approaches that would need to be retrained. To tackle this problem, we propose a more efficient solution that learns spatio-temporal features self-adapting to the object of interest via conditional affine transformations. This approach is simple, can be trained end-to-end and does not necessarily require extra training steps at inference time. Our method shows competitive results on DAVIS2016 with respect to state-of-the-art approaches that use online fine-tuning, and outperforms them on DAVIS2017. ReConvNet also shows promising results on the DAVIS-Challenge 2018, winning the 10th position. Comment: CVPR Workshop - DAVIS Challenge 201

    Asynchronous Convolutional Networks for Object Detection in Neuromorphic Cameras

    Event-based cameras, also known as neuromorphic cameras, are bioinspired sensors able to perceive changes in the scene at high frequency with low power consumption. Because they became available only very recently, a limited amount of work addresses object detection on these devices. In this paper we propose two neural network architectures for object detection: YOLE, which integrates the events into surfaces and uses a frame-based model to process them, and fcYOLE, an asynchronous event-based fully convolutional network which uses a novel and general formalization of the convolutional and max pooling layers to exploit the sparsity of camera events. We evaluate the algorithm with different extensions of publicly available datasets and on a novel synthetic dataset. Comment: accepted at CVPR2019 Event-based Vision Workshop

    Improving Generalization in Federated Learning by Seeking Flat Minima

    Models trained in federated settings often suffer from degraded performance and fail to generalize, especially when facing heterogeneous scenarios. In this work, we investigate such behavior through the lens of the geometry of the loss and the Hessian eigenspectrum, linking the model's lack of generalization capacity to the sharpness of the solution. Motivated by prior studies connecting the sharpness of the loss surface and the generalization gap, we show that i) training clients locally with Sharpness-Aware Minimization (SAM) or its adaptive version (ASAM) and ii) averaging stochastic weights (SWA) on the server side can substantially improve generalization in Federated Learning and help bridge the gap with centralized models. By seeking parameters in neighborhoods having uniformly low loss, the model converges towards flatter minima and its generalization significantly improves in both homogeneous and heterogeneous scenarios. Empirical results demonstrate the effectiveness of those optimizers across a variety of benchmark vision datasets (e.g. CIFAR10/100, Landmarks-User-160k, IDDA) and tasks (large scale classification, semantic segmentation, domain generalization).
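
The local SAM step mentioned above can be sketched on a toy problem. This is a minimal illustration, not the paper's federated training code: the quadratic loss, the learning rate `lr`, the neighborhood radius `rho`, and the function names `loss`, `loss_grad` and `sam_step` are all assumptions made for the example.

```python
import math


def loss(w):
    """Toy loss (an assumption for illustration): sharp in w[0], flat in w[1]."""
    return 0.5 * (10.0 * w[0] ** 2 + w[1] ** 2)


def loss_grad(w):
    """Analytic gradient of the toy loss above."""
    return [10.0 * w[0], w[1]]


def sam_step(w, lr=0.05, rho=0.05):
    """One Sharpness-Aware Minimization step.

    1. Take the gradient at w.
    2. Move to the approximate worst-case point inside a ball of radius rho.
    3. Update w using the gradient evaluated at that perturbed point.
    """
    g = loss_grad(w)
    norm = math.sqrt(sum(gi * gi for gi in g)) or 1.0  # guard against a zero gradient
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    g_adv = loss_grad(w_adv)
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]


if __name__ == "__main__":
    w = [1.0, 1.0]
    for _ in range(200):
        w = sam_step(w)
    print(w, loss(w))
```

In a federated setting each client would run steps like this on its own data before sending weights to the server; the server-side weight averaging described in the abstract is not shown here.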

    FedDrive v2: an Analysis of the Impact of Label Skewness in Federated Semantic Segmentation for Autonomous Driving

    We propose FedDrive v2, an extension of the Federated Learning benchmark for Semantic Segmentation in Autonomous Driving. While the first version aims at studying the effect of domain shift of the visual features across clients, in this work we focus on the distribution skewness of the labels. We propose six new federated scenarios to investigate how label skewness affects the performance of segmentation models and compare it with the effect of domain shift. Finally, we study the impact of using the domain information during testing. Comment: 5th Italian Conference on Robotics and Intelligent Machines (I-RIM) 202

    Window-based Model Averaging Improves Generalization in Heterogeneous Federated Learning

    Federated Learning (FL) aims to learn a global model from distributed users while protecting their privacy. However, when data are distributed heterogeneously the learning process becomes noisy, unstable, and biased towards the last seen clients' data, slowing down convergence. To address these issues and improve the robustness and generalization capabilities of the global model, we propose WIMA (Window-based Model Averaging). WIMA aggregates global models from different rounds using a window-based approach, effectively capturing knowledge from multiple users and reducing the bias from the last ones. By adopting a windowed view on the rounds, WIMA can be applied from the initial stages of training. Importantly, our method introduces no additional communication or client-side computation overhead. Our experiments demonstrate the robustness of WIMA against distribution shifts and bad client sampling, resulting in smoother and more stable learning trends. Additionally, WIMA can be easily integrated with state-of-the-art algorithms. We extensively evaluate our approach on standard FL benchmarks, demonstrating its effectiveness. Comment: International Conference on Computer Vision Workshop (ICCVW
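
The window-based aggregation can be sketched as a server-side running average over the most recent global models. The abstract does not specify the weighting, so the uniform average used here is an assumption, and the class name `WindowAverager` and its `update` method are hypothetical.

```python
from collections import deque


class WindowAverager:
    """Keep the last `window_size` global models and return their average.

    Models are plain lists of floats here; a real system would average
    per-layer tensors instead. Uniform weighting is an assumption.
    """

    def __init__(self, window_size):
        # deque with maxlen drops the oldest model automatically.
        self.window = deque(maxlen=window_size)

    def update(self, global_weights):
        """Add the newest global model and return the windowed average."""
        self.window.append(list(global_weights))
        n = len(self.window)
        return [sum(values) / n for values in zip(*self.window)]


if __name__ == "__main__":
    averager = WindowAverager(window_size=3)
    for round_weights in ([1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [7.0, 6.0]):
        print(averager.update(round_weights))
```

Because the window is filled incrementally, an average is available from the first round, and everything runs on the server, so clients send and compute nothing extra.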

    Italian Chronic Cerebrospinal Venous Insufficiency National Epidemiological Observatory methodology and preliminary data

    The aim of our work is to describe the use and potential of the Mem-net program and to present the data from the first three years of activity of the Italian Chronic Cerebrospinal Venous Insufficiency (CCSVI) National Epidemiological Observatory (NEO) (http://www.osservatorioccsvi.org). From 2011 to 2014, all echo-color-Doppler (ECD) assessments were stored by the Mem-net program on the CCSVI-NEO web site (http://www.mem-net.it). Mem-net is a tool for multicenter data collection based on the International Society for Neurovascular Disease consensus and position statement, in which we can enter patients' (pts) history, neurological visits, ECD assessments, other examinations, therapies and surgical procedures. The website provides an epidemiological and statistical program for real-time data analysis. At present, 7 medical centers affiliated with CCSVI-NEO enter their symptomatic and asymptomatic subjects with CCSVI. Data were stored using the Mem-net program. We analyzed data from only four of the seven centers (Rome, Bari, Cagliari and Benevento). The total number of pts with multiple sclerosis (MS) was 1109, mean age 46.0±13.4 [male 422 (38.05%); female 687 (61.95%)]. CCSVI positive pts were 937 (84.49%), CCSVI negative pts were 172 (15.51%). The CCSVI type 1 subjects were 530 (56.56%), CCSVI type 2 subjects were 20 (2.13%), CCSVI type 3 subjects were 387 (41.30%). We found 800 (85.38%) pts with criterion 1; 725 (77.37%) with criterion 2; 519 (55.39%) with criterion 3; 483 (51.55%) with criterion 4; 88 (9.39%) with criterion 5. The mean venous hemodynamic insufficiency severity score was 3.8; the mean CCSVI score was 2.8; the mean MEM score was 34.7; the mean expanded disability status scale score was 4.5; the mean disease duration was 12.5±5.7 years. MS clinical types were divided as follows: relapsing-remitting pts were 449 (47.92%), secondary progressive pts were 144 (15.37%), primary progressive pts were 72 (7.68%). The CCSVI-NEO database and Mem-net software may be useful medical and research tools for recording, storing, analyzing and studying ECD and vascular data. Preliminary data of NEO show an elevated prevalence of CCSVI in MS.
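
The reported percentages can be reproduced from the counts in the abstract. The short check below is only an arithmetic sketch: the helper name `pct` is hypothetical, and the choice of denominator (all 1109 MS patients for sex and CCSVI status, the 937 CCSVI-positive patients for CCSVI type) is inferred from the figures rather than stated in the source.

```python
def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to two decimals."""
    return round(100.0 * part / whole, 2)


ms_total = 1109   # patients with MS in the four analyzed centers
ccsvi_pos = 937   # CCSVI-positive patients
ccsvi_neg = 172   # CCSVI-negative patients
types = {"type 1": 530, "type 2": 20, "type 3": 387}

if __name__ == "__main__":
    print("CCSVI positive:", pct(ccsvi_pos, ms_total))
    print("CCSVI negative:", pct(ccsvi_neg, ms_total))
    for name, count in types.items():
        # Type percentages are taken over CCSVI-positive patients only.
        print(name, pct(count, ccsvi_pos))
```

The counts are internally consistent: positive and negative patients add up to 1109, and the three CCSVI types add up to 937.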